Zero-Shot Relation Extraction via Reading Comprehension
We show that relation extraction can be reduced to answering simple reading
comprehension questions, by associating one or more natural-language questions
with each relation slot. This reduction has several advantages: we can (1)
learn relation-extraction models by extending recent neural
reading-comprehension techniques, (2) build very large training sets for those
models by combining relation-specific crowd-sourced questions with distant
supervision, and even (3) do zero-shot learning by extracting new relation
types that are only specified at test-time, for which we have no labeled
training examples. Experiments on a Wikipedia slot-filling task demonstrate
that the approach can generalize to new questions for known relation types with
high accuracy, and that zero-shot generalization to unseen relation types is
possible, at lower accuracy levels, setting the bar for future work on this
task.
Comment: CoNLL 2017
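This reduction is easy to prototype with any extractive QA model. The sketch below uses an off-the-shelf Hugging Face QA pipeline as a stand-in for the paper's reading-comprehension model; the question templates, checkpoint, and confidence threshold are illustrative assumptions, not the paper's setup.

```python
# A minimal sketch of relation extraction reduced to reading comprehension.
# The QA pipeline, question templates, and threshold are illustrative
# stand-ins, not the paper's model or crowd-sourced question data.
from transformers import pipeline

# Hypothetical natural-language question templates for two relation slots.
RELATION_QUESTIONS = {
    "educated_at": "Where did {subject} study?",
    "occupation": "What did {subject} do for a living?",
}

qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")

def extract_relation(relation, subject, passage, threshold=0.5):
    """Fill one relation slot by asking its associated question.

    Returning None on a low answer score approximates the paper's
    handling of unanswerable (negative) examples. An unseen relation
    type needs only a new question template: zero-shot extraction."""
    question = RELATION_QUESTIONS[relation].format(subject=subject)
    result = qa(question=question, context=passage)
    return result["answer"] if result["score"] >= threshold else None

passage = ("Ada Lovelace was an English mathematician who worked on "
           "Charles Babbage's Analytical Engine.")
print(extract_relation("occupation", "Ada Lovelace", passage))
```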
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
We present TriviaQA, a challenging reading comprehension dataset containing
over 650K question-answer-evidence triples. TriviaQA includes 95K
question-answer pairs authored by trivia enthusiasts and independently gathered
evidence documents, six per question on average, which provide high-quality
distant supervision for answering the questions. We show that, in comparison to
other recently introduced large-scale datasets, TriviaQA (1) has relatively
complex, compositional questions, (2) has considerable syntactic and lexical
variability between questions and corresponding answer-evidence sentences, and
(3) requires more cross sentence reasoning to find answers. We also present two
baseline algorithms: a feature-based classifier and a state-of-the-art neural
network that performs well on the SQuAD reading comprehension task. Neither approach
comes close to human performance (23% and 40% vs. 80%), suggesting that
TriviaQA is a challenging testbed that is worth significant future study. Data
and code are available at http://nlp.cs.washington.edu/triviaqa/
Comment: Added references, fixed typos, minor baseline update
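Scoring against such distantly supervised answers is typically alias-aware exact match. The sketch below follows the usual SQuAD-style normalization recipe (lowercase, strip punctuation and articles); it is illustrative, not TriviaQA's official evaluation script.

```python
# A sketch of alias-aware exact-match scoring of the kind used for
# TriviaQA-style evaluation. Normalization details are illustrative.
import re
import string

def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, answer_aliases):
    """A prediction is correct if it matches any alias of the answer."""
    pred = normalize(prediction)
    return any(pred == normalize(alias) for alias in answer_aliases)

print(exact_match("The Eiffel Tower", ["Eiffel Tower", "Tour Eiffel"]))  # True
```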
Concise Answers to Complex Questions: Summarization of Long-form Answers
Long-form question answering systems provide rich information by presenting
paragraph-level answers, often containing optional background or auxiliary
information. While such comprehensive answers are helpful, not all information
is required to answer the question (e.g., users with domain knowledge do not
need an explanation of background). Can we provide a concise version of the
answer by summarizing it, while still addressing the question? We conduct a
user study on summarized answers generated from state-of-the-art models and our
newly proposed extract-and-decontextualize approach. We find that a large proportion
of long-form answers (over 90%) in the ELI5 domain can be adequately summarized
by at least one system, while complex and implicit answers are challenging to
compress. We observe that decontextualization improves the quality of the
extractive summary, exemplifying its potential in the summarization task. To
promote future work, we provide an extractive summarization dataset covering 1K
long-form answers and our user study annotations. Together, we present the
first study on summarizing long-form answers, taking a step forward for QA
agents that can provide answers at multiple granularities.
Comment: ACL 2023 Long Paper
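A minimal sketch of the extract-and-decontextualize idea follows, with both stages simplified: sentence selection by lexical overlap stands in for the learned extractive summarizer, and the rewriting step is a placeholder for the trained decontextualization model.

```python
# A minimal sketch of extract-and-decontextualize summarization: select the
# most question-relevant sentence, then rewrite it to stand on its own.
# Both stages are simplified stand-ins for the paper's learned components.

def extract(question, answer_sentences):
    """Crude extractive step: score sentences by word overlap with the
    question (the paper trains a model for this)."""
    q_words = set(question.lower().split())
    return max(answer_sentences,
               key=lambda s: len(q_words & set(s.lower().split())))

def decontextualize(sentence, full_answer):
    """Placeholder for the learned rewriting step, which resolves pronouns
    and adds missing referents so the sentence reads as a standalone answer."""
    return sentence  # a trained seq2seq rewriter would edit the sentence here

question = "Why does the sky look blue?"
answer = [
    "Sunlight is scattered by molecules in the atmosphere.",
    "The sky looks blue because shorter blue wavelengths scatter the most.",
]
print(decontextualize(extract(question, answer), answer))
```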
RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation
Retrieving documents and prepending them in-context at inference time
improves the performance of language models (LMs) on a wide range of tasks. However,
these documents, often spanning hundreds of words, make inference substantially
more expensive. We propose compressing the retrieved documents into textual
summaries prior to in-context integration. This not only reduces the
computational costs but also relieves the burden of LMs to identify relevant
information in long retrieved documents. We present two compressors -- an
extractive compressor which selects useful sentences from retrieved documents
and an abstractive compressor which generates summaries by synthesizing
information from multiple documents. Both compressors are trained to improve
LMs' performance on end tasks when the generated summaries are prepended to the
LMs' input, while keeping the summary concise. If the retrieved documents are
irrelevant to the input or offer no additional information to the LM, our
compressor can return an empty string, implementing selective augmentation. We
evaluate our approach on language modeling and open-domain question answering.
We achieve a compression rate as low as 6% with minimal loss in performance on
both tasks, significantly outperforming off-the-shelf
summarization models. We show that our compressors trained for one LM can
transfer to other LMs on the language modeling task and provide summaries
largely faithful to the retrieved documents.
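The selective-augmentation logic is easy to sketch. Below, a toy lexical-overlap compressor stands in for the paper's trained extractive and abstractive compressors; returning an empty string when no retrieved sentence overlaps the query reproduces the "no augmentation" behavior described above. All names are illustrative.

```python
# A sketch of RECOMP-style compression with selective augmentation. The
# overlap-based compressor is a toy stand-in for the trained compressors.

def compress(question, documents, max_sentences=1):
    """Keep the retrieved sentences most similar to the question; return an
    empty string when nothing is relevant (selective augmentation)."""
    q_words = set(question.lower().split())
    sentences = [s.strip() for doc in documents
                 for s in doc.split(".") if s.strip()]
    scored = [(len(q_words & set(s.lower().split())), s) for s in sentences]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return ". ".join(s for _, s in scored[:max_sentences])

def build_prompt(question, documents):
    """Prepend the summary only when the compressor produced one."""
    summary = compress(question, documents)
    prefix = summary + ".\n\n" if summary else ""
    return f"{prefix}Question: {question}\nAnswer:"

docs = ["Paris is the capital of France. It lies on the Seine.",
        "Mount Everest is the highest mountain on Earth."]
print(build_prompt("What is the capital of France?", docs))
```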